MON-2693: Scrape profiles #1785
Conversation
JoaoBraveCoding commented on Sep 30, 2022
- I added CHANGELOG entry for this change.
- No user facing changes, so no entry in CHANGELOG was needed.
Force-pushed from 00bdceb to 40abf80
p.Spec.ServiceMonitorSelector = labelSelector
p.Spec.PodMonitorSelector = labelSelector
p.Spec.ProbeSelector = labelSelector
I think we need to leave ProbeSelector untouched as it's always nil.
CMO would need to deploy all new service monitors so they can be picked up by Prometheus?
Force-pushed from 43492b3 to b5c83dd
Force-pushed from b5c83dd to 5520644
@JoaoBraveCoding it's ok to undraft it, isn't it?
Good start! Should we first implement full & operational profiles only?
Force-pushed from 3dd4089 to 4aa35ab
Force-pushed from a277feb to e604174
/retest-required
- action: labeldrop
  regex: instance
- action: keep
  regex: (kube_daemonset_status_current_number_scheduled|kube_daemonset_status_desired_number_scheduled|kube_daemonset_status_number_available|kube_daemonset_status_number_misscheduled|kube_daemonset_status_updated_number_scheduled|kube_deployment_metadata_generation|kube_deployment_spec_replicas|kube_deployment_status_observed_generation|kube_deployment_status_replicas_available|kube_deployment_status_replicas_updated|kube_horizontalpodautoscaler_spec_max_replicas|kube_horizontalpodautoscaler_spec_min_replicas|kube_horizontalpodautoscaler_status_current_replicas|kube_horizontalpodautoscaler_status_desired_replicas|kube_job_failed|kube_job_status_active|kube_job_status_start_time|kube_node_info|kube_node_labels|kube_node_role|kube_node_spec_taint|kube_node_spec_unschedulable|kube_node_status_allocatable|kube_node_status_capacity|kube_node_status_condition|kube_persistentvolume_info|kube_persistentvolume_status_phase|kube_persistentvolumeclaim_access_mode|kube_persistentvolumeclaim_info|kube_persistentvolumeclaim_labels|kube_persistentvolumeclaim_resource_requests_storage_bytes|kube_pod_container_resource_limits|kube_pod_container_resource_requests|kube_pod_container_status_last_terminated_reason|kube_pod_container_status_restarts_total|kube_pod_container_status_waiting_reason|kube_pod_info|kube_pod_owner|kube_pod_status_phase|kube_pod_status_ready|kube_pod_status_unschedulable|kube_poddisruptionbudget_status_current_healthy|kube_poddisruptionbudget_status_desired_healthy|kube_poddisruptionbudget_status_expected_pods|kube_replicaset_owner|kube_replicationcontroller_owner|kube_resourcequota|kube_state_metrics_list_total|kube_state_metrics_watch_total|kube_statefulset_metadata_generation|kube_statefulset_replicas|kube_statefulset_status_current_revision|kube_statefulset_status_observed_generation|kube_statefulset_status_replicas|kube_statefulset_status_replicas_ready|kube_statefulset_status_replicas_updated|kube_statefulset_status_update_revision|kube_storageclass_info|process_start_time_seconds)
kube_node_labels is kept, but not:
kube_pod_labels
kube_namespace_labels
kube_poddisruptionbudget_labels
kube_persistentvolume_labels
kube_persistentvolumeclaim_labels
We have the following bugs requiring the above metrics to be kept:
https://bugzilla.redhat.com/show_bug.cgi?id=2011698
https://bugzilla.redhat.com/show_bug.cgi?id=2015386
https://bugzilla.redhat.com/show_bug.cgi?id=2018431
I added these, but I'll have to investigate why the tool I developed didn't pick them up. I'll open an issue on the project and add a task to the GA epic.
Okay, looking quickly at the bugs, these seem to be metrics that are not used in our default alerting, so it's normal that they were excluded from the list. The minimal profile is very restrictive and should only contain metrics that are essential to the default alerts, default rules, console, and telemetry.
Note that kube_persistentvolumeclaim_labels is already on the list.
I've also searched the CMO repo for those metrics to double-check, and they are not used, so from my POV things are working correctly and those metrics should be excluded (except kube_persistentvolumeclaim_labels, which was already on the list). But do let me know if I missed something.
Thanks, it makes sense.
Force-pushed from 26767f7 to 136051a
Force-pushed from 136051a to f9b351b
/label qe-approved
/retest-required
Force-pushed from f9b351b to ff3b003
Signed-off-by: JoaoBraveCoding <jmarcal@redhat.com>
Force-pushed from ff3b003 to ad0a06d
/retest-required
2 similar comments
/retest-required
/retest-required
/test e2e-agnostic
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: jan--f, JoaoBraveCoding, simonpasquier. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/retest-required
@JoaoBraveCoding: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.